N=1:

rank | frequency | n-gram |
---|---|---|
1 | 3684 | -í |
2 | 3386 | -u |
3 | 3323 | -a |
4 | 2813 | -e |
5 | 2566 | -i |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 1601 | -ní |
2 | 1575 | -ch |
3 | 1133 | -ou |
4 | 1012 | -la |
5 | 930 | -ho |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 689 | -ých |
2 | 567 | -ého |
3 | 505 | -ích |
4 | 405 | -ová |
5 | 366 | -nou |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 279 | -ovat |
2 | 274 | -ných |
3 | 233 | -ních |
4 | 206 | -ního |
5 | 202 | -ného |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 152 | -ování |
2 | 122 | -ovala |
3 | 118 | -ových |
4 | 114 | -ského |
5 | 111 | -ských |

The tables above show the most frequent letter n-grams at the ends of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
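
For readers without access to the database, the same counting step can be sketched in Python. This is a minimal illustration, not the code used to produce the tables: the function name `top_endings` and the toy word list are assumptions made for the example, and the sketch expects a list of words that has already been filtered the way the query's `w_id>100` condition filters the `words` table. Ties are broken alphabetically here, whereas the query leaves the order of ties unspecified.

```python
from collections import Counter
import unicodedata


def top_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent word endings.

    Hypothetical helper for illustration; expects n >= 1.
    """
    # Normalize to NFC so that an accented letter such as "ý" counts as one character.
    normalized = (unicodedata.normalize("NFC", w) for w in words)
    # Like RIGHT(word, n) in the query, a word shorter than n contributes itself whole.
    counts = Counter(w[-n:] for w in normalized)
    # Sort by descending frequency, then alphabetically to make ties deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(ranked[:limit], start=1)
    ]


# Toy input, not corpus data.
toy_words = ["nových", "starých", "malých", "nového", "dobrého", "pracovat"]
for row in top_endings(toy_words, 3):
    print(row)
```

Calling `top_endings(words, n)` for n from 1 to 5 would yield one table per n-gram length, in the same rank, frequency, n-gram layout as the tables above.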

See also: 2.2.5 Most frequent word beginnings